First view of the data

Analyzing the data: showing the head and tail of the dataset.

Checking the length of each series.

## [1] "Series 1 has 300 data points."
## [1] "Series 2 has 300 data points."
## [1] "Series 3 has 300 data points."
## [1] "Series 4 has 300 data points."
## [1] "Series 5 has 2000 data points."
## [1] "Series 6 has 3000 data points."
## [1] "Series 7 has 3000 data points."

Assigning the series.

series1 <- data[,1][1:300]
series2 <- data[,2][1:300]
series3 <- data[,3][1:300]
series4 <- data[,4][1:300]
series5 <- data[,5][1:2000]
series6 <- data[,6]
series7 <- data[,7]

Series 1

Graphical exploration (first moment - mean)

y<-series1    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Computing basic stats

mean(y) # compute basic statistics
## [1] 0.03205
sd(y)
## [1] 1.193712
skewness(y)
## [1] -0.2795834
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 4.040454
## attr(,"method")
## [1] "moment"

Checking for CS (covariance stationarity)

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 0

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.98868, p-value = 0.01941

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 18.804, df = 20, p-value = 0.5346

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 14.168, df = 20, p-value = 0.8219

Series 2

Graphical exploration (first moment - mean)

y<-series2    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Observations:
- TS (stochastic process) does not look stationary in either the mean or the variance
- A downward trend can be observed
- Based on the ACF, the data seems correlated

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- The TS does not appear normally distributed; the data is widely spread.

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size
par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Observations:
- Based on the ACF of the second moment (variance), there seems to be correlation

Computing basic stats

mean(y) # compute basic statistics
## [1] -0.22853
sd(y)
## [1] 6.31268
skewness(y)
## [1] 0.04883792
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 1.765163
## attr(,"method")
## [1] "moment"

Checking for CS

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 1

Observations:
- TS requires transformation to become stationary
- To make it stationary we need to apply the first difference

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.94921, p-value = 1.133e-08

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 4509.5, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is correlated in the mean

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 2317.4, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is correlated in the variance

Transformed data

Applying the difference operator with the default order (one difference)

# Just in case we need to take one difference of the original data (as in this case)

z<-diff(y)  
ts.plot(z)

par(mfrow=c(3,1))
ts.plot(z)   
acf(z)
pacf(z)

Observations:
- TS looks stationary

Checking for CS

ndiffs(z, alpha=0.05, test=c("adf")) 
## [1] 0

Observations:
- Confirmation that no further transformation is needed: the differenced data is stationary

Checking for normality (graphically)

#Checking for normality
hist(z,prob=T,ylim=c(0,0.6),xlim=c(mean(z)-3*sd(z),mean(z)+3*sd(z)),col="red")
lines(density(z),lwd=2)
mu<-mean(z)
sigma<-sd(z)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- Normality is observed

Checking for normality (formal)

shapiro.test(z)
## 
##  Shapiro-Wilk normality test
## 
## data:  z
## W = 0.99619, p-value = 0.69

Observations:
- We fail to reject \(H_0\) since \(P_{value} > 0.05\), \(\therefore\) the TS is normally distributed

Checking for correlation (first moment - mean)

Box.test (z, lag = 20, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  z
## X-squared = 12.452, df = 20, p-value = 0.8996

Observations:
- We fail to reject \(H_0\) since \(P_{value} > 0.05\)
- TS is uncorrelated in the mean
- TS is WN
- TS is GWN since the data is normally distributed
- TS is SWN since GWN implies SWN

Checking for correlation (second moment - variance)

Box.test (z^2, lag = 20, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  z^2
## X-squared = 9.7899, df = 20, p-value = 0.9718

Observations:
- We fail to reject \(H_0\) since \(P_{value} > 0.05\)
- TS is uncorrelated in the variance
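As an illustrative sanity check (not part of the original analysis), simulated Gaussian white noise should pass the same three tests applied to z above: normality, no correlation in the mean, and no correlation in the variance.

```r
# Simulated Gaussian white noise run through the same test battery as z.
set.seed(123)
w <- rnorm(300)
shapiro.test(w)$p.value                           # expected > 0.05 (normal)
Box.test(w,   lag = 20, type = "Ljung")$p.value   # expected > 0.05 (uncorrelated mean)
Box.test(w^2, lag = 20, type = "Ljung")$p.value   # expected > 0.05 (uncorrelated variance)
```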

Series 3

Graphical exploration (first moment - mean)

y<-series3    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Observations:
- TS shows a downward trend
- TS does not look stationary
- TS seems to have correlated data

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- TS does not seem normally distributed

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Observations:
- TS (second moment - variance) shows an upward trend and correlated data

Computing basic stats

mean(y) # compute basic statistics
## [1] -21.65212
sd(y)
## [1] 10.42905
skewness(y)
## [1] 0.3495993
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 2.197106
## attr(,"method")
## [1] "moment"

Checking for CS

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 1

Observations:
- TS requires transformation to become stationary
- To make it stationary we need to apply the first difference

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.9645, p-value = 1.019e-06

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 3598.9, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the mean
- TS is not WN, GWN or SWN
- Linear model possible

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 3552, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the variance
- A non-linear model may be possible but is not needed

Transformed data

Applying the difference operator with the default order (one difference)

# Just in case we need to take one difference of the original data (as in this case)

z<-diff(y)  
ts.plot(z)

par(mfrow=c(3,1))
ts.plot(z)   
acf(z)
pacf(z)

ndiffs(z, alpha=0.05, test=c("adf"))
## [1] 0
Box.test (z, lag = 20, type="Ljung") 
## 
##  Box-Ljung test
## 
## data:  z
## X-squared = 148.71, df = 20, p-value < 2.2e-16
Box.test (z^2, lag = 20, type="Ljung") 
## 
##  Box-Ljung test
## 
## data:  z^2
## X-squared = 47.582, df = 20, p-value = 0.0004868
#Checking for normality

shapiro.test(z)
## 
##  Shapiro-Wilk normality test
## 
## data:  z
## W = 0.99344, p-value = 0.2178
hist(z,prob=T,ylim=c(0,0.6),xlim=c(mean(z)-3*sd(z),mean(z)+3*sd(z)),col="red")
lines(density(z),lwd=2)
mu<-mean(z)
sigma<-sd(z)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- TS is stationary (constant mean and variance across time); no more transformations are needed
- TS data is normally distributed since we fail to reject \(H_0\) given that \(P_{value} > 0.05\)
- TS is correlated in both the mean and the variance since we reject \(H_0\) for both given that \(P_{value} < 0.05\), \(\therefore\) TS is not WN, GWN or SWN
- A linear model is possible; a non-linear one is also possible but not needed

Series 4

Graphical exploration (first moment - mean)

y<-series4    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Observations:
- TS shows an upward trend
- TS shows correlation in the data

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- TS does not appear normally distributed

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Observations:
- TS shows an upward trend and correlated data in the variance

Computing basic stats

mean(y) # compute basic statistics
## [1] 1256.843
sd(y)
## [1] 1154.857
skewness(y)
## [1] -0.1488262
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 1.559977
## attr(,"method")
## [1] "moment"

Observations:
- Kurtosis (about 1.56) is well below 3, and the standard deviation is almost as large as the mean, consistent with a trending, non-stationary series

Checking for CS

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 2

Observations:
- TS requires two differences to become stationary

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.91161, p-value = 2.779e-12

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 5290.9, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the mean
- TS is not WN or GWN

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 4641.8, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the variance
- TS is not SWN based on the first and second moments; we cannot assess higher moments, but these two suffice here

Transformed data

Applying the difference operator with differences = 2

# In this case we need to take two differences of the original data

z<-diff(y, differences = 2)  
ts.plot(z)

par(mfrow=c(3,1))
ts.plot(z)   
acf(z)
pacf(z)

Observations:
- TS looks stationary
- TS shows correlated data

Checking for CS

ndiffs(z, alpha=0.05, test=c("adf")) 
## [1] 0

Observations:
- Confirmation that no further transformation is needed: the differenced data is stationary

Checking for normality (graphically)

#Checking for normality
hist(z,prob=T,ylim=c(0,0.6),xlim=c(mean(z)-3*sd(z),mean(z)+3*sd(z)),col="red")
lines(density(z),lwd=2)
mu<-mean(z)
sigma<-sd(z)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- Normality is observed

Checking for normality (formal)

shapiro.test(z)
## 
##  Shapiro-Wilk normality test
## 
## data:  z
## W = 0.99438, p-value = 0.3416

Observations:
- We fail to reject \(H_0\) since \(P_{value} > 0.05\), \(\therefore\) the TS is normally distributed

Checking for correlation (first moment - mean)

Box.test (z, lag = 20, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  z
## X-squared = 297.76, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\)
- TS is correlated in the mean
- TS is not WN, GWN or SWN

Checking for correlation (second moment - variance)

Box.test (z^2, lag = 20, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  z^2
## X-squared = 72.7, df = 20, p-value = 6.562e-08

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\)
- TS is correlated in the variance

Series 5

Graphical exploration (first moment - mean)

y<-series5    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Observations:
- TS seems stationary and mildly correlated

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- TS appears normal

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Observations:
- TS (second moment) appears correlated

Computing basic stats

mean(y) # compute basic statistics
## [1] 0.0071755
sd(y)
## [1] 0.6797896
skewness(y)
## [1] -0.08744834
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 4.36924
## attr(,"method")
## [1] "moment"

Checking for CS

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 0

Observations:
- Confirmation of CS (covariance stationarity)

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.99047, p-value = 3.456e-10

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 33.151, df = 20, p-value = 0.03248

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the mean
- TS is not WN, GWN or SWN

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 1061.5, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the variance

Series 6

Graphical exploration (first moment - mean)

y<-series6    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Observations:
- TS seems stationary and correlated

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- TS seems normally distributed

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Observations:
- TS seems stationary and correlated

Computing basic stats

mean(y) # compute basic statistics
## [1] 0.007541
sd(y)
## [1] 0.7605913
skewness(y)
## [1] -0.2023789
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 5.113352
## attr(,"method")
## [1] "moment"

Checking for CS

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 0

Observations:
- TS is stationary

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.98338, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 537.69, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the mean
- TS is not WN, GWN or SWN
- A linear model is possible; a non-linear model is also possible but not needed

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 1856.6, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the variance

Series 7

Graphical exploration (first moment - mean)

y<-series7    # from now on, "y" is the data we are going to work with

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # plot the series, its acf and pacf together
ts.plot(y)   
acf(y)
pacf(y)

Observations:
- TS seems not stationary since the mean is not constant across time
- The variance also seems to change across time
- TS has correlated data

Checking for Normality (graphically)

#Checking for normality graphically
hist(y,prob=T,ylim=c(0,0.6),xlim=c(mean(y)-3*sd(y),mean(y)+3*sd(y)),col="red")
lines(density(y),lwd=2)
mu<-mean(y)
sigma<-sd(y)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- TS does not seem normally distributed

Graphical exploration (second moment - variance)

# C.    Testing for STRICT WHITE NOISE

par(mar=c(1,1,1,1)) # to adjust graphic size

par(mfrow=c(3,1)) # analysis of the squared data
ts.plot(y^2)   
acf(y^2)
pacf(y^2)

Observations:
- TS does not seem stationary in the variance
- TS seems correlated

Computing basic stats

mean(y) # compute basic statistics
## [1] 10.88196
sd(y)
## [1] 10.28588
skewness(y)
## [1] -0.1702387
## attr(,"method")
## [1] "moment"
kurtosis(y,method=c("moment"))  
## [1] 2.00975
## attr(,"method")
## [1] "moment"

Checking for CS

# formal unit root test (Augmented Dickey Fuller test). Testing for stationarity.
# Ho: the process is not stationary (it contains at least one unit root)
# H1: the process is stationary; we then check different models (lags)
ndiffs(y, alpha=0.05, test=c("adf")) # number of regular differences?
## [1] 1

Observations:
- TS needs a transformation (one regular difference), i.e. we take the first difference to make it stationary

Checking for Normality

# formal normality test
# Ho: the data is normally distributed
# H1: the data is not normally distributed
shapiro.test(y)
## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.96711, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y, lag = 20, type = "Ljung")   # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y
## X-squared = 57372, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the mean
- TS is not WN, GWN or SWN
- Linear model possible

Checking for correlation in the second moment

# formal test for white noise (zero autocorrelations)
# Ho: uncorrelated data
# H1: correlated data
Box.test(y^2, lag = 20, type = "Ljung") # Null: rho_1 = ... = rho_20 = 0
## 
##  Box-Ljung test
## 
## data:  y^2
## X-squared = 55788, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS has correlated data in the variance
- Non-linear model possible but not needed

Transformed data

Applying the difference operator with the default order (one difference)

# Just in case we need to take one difference of the original data (as in this case)

z<-diff(y)  
ts.plot(z)

par(mfrow=c(3,1))
ts.plot(z)   
acf(z)
pacf(z)

Observations:
- TS looks stationary

Checking for CS

ndiffs(z, alpha=0.05, test=c("adf")) 
## [1] 0

Observations:
- Confirmation that no further transformation is needed: the differenced data is stationary

Checking for normality (graphically)

#Checking for normality
hist(z,prob=T,ylim=c(0,0.6),xlim=c(mean(z)-3*sd(z),mean(z)+3*sd(z)),col="red")
lines(density(z),lwd=2)
mu<-mean(z)
sigma<-sd(z)
x<-seq(mu-3*sigma,mu+3*sigma,length=100)
yy<-dnorm(x,mu,sigma)
lines(x,yy,lwd=2,col="blue")

Observations:
- Normality is observed

Checking for normality (formal)

shapiro.test(z)
## 
##  Shapiro-Wilk normality test
## 
## data:  z
## W = 0.98584, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\), \(\therefore\) the TS is not normally distributed

Checking for correlation (first moment - mean)

Box.test (z, lag = 20, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  z
## X-squared = 40.012, df = 20, p-value = 0.004978

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\)
- TS is correlated in the mean
- TS is not WN, GWN or SWN

Checking for correlation (second moment - variance)

Box.test (z^2, lag = 20, type="Ljung")
## 
##  Box-Ljung test
## 
## data:  z^2
## X-squared = 1632.1, df = 20, p-value < 2.2e-16

Observations:
- We reject \(H_0\) since \(P_{value} < 0.05\)
- TS is correlated in the variance